Do not crash if filesystem can't fsync #1262

Draft: wants to merge 1 commit into base: main

l0rinc commented Apr 4, 2025

Cherry-pick of bitcoin-core@d42e63d

google-cla bot commented Apr 4, 2025

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

This code moved, re-apply the fix elsewhere.

See
- bitcoin/bitcoin#10000
- bitcoin-core/leveldb-old#16

Original change by Nicolas Dorier <nicolas.dorier@gmail.com>, ported to leveldb 1.22 by Wladimir J. van der Laan.

felipecrv (Contributor) left a comment

This is really bad. Retry with exponential backoff on EINVAL, and if the error persists after multiple attempts, return it so the caller knows the data hasn't been committed to storage media.
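
A minimal sketch of the retry policy suggested above, not code from this pull request. The helper name `SyncWithBackoff`, its parameters, and the convention that the callback returns 0 or an errno value are all assumptions made for illustration.

```cpp
#include <cerrno>
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical helper (not part of LevelDB). Calls `sync_op` until it
// succeeds, fails with an error other than EINVAL, or `max_attempts` is
// exhausted. `sync_op` must return 0 on success or an errno value on failure.
// Returns 0 on success, otherwise the last error seen, so the caller learns
// that the data may not have reached storage.
int SyncWithBackoff(const std::function<int()>& sync_op, int max_attempts,
                    std::chrono::milliseconds initial_delay) {
  if (max_attempts < 1) max_attempts = 1;  // always try at least once
  std::chrono::milliseconds delay = initial_delay;
  int err = 0;
  for (int attempt = 0; attempt < max_attempts; ++attempt) {
    err = sync_op();
    if (err == 0) return 0;         // synced successfully
    if (err != EINVAL) return err;  // only EINVAL is retried
    if (attempt + 1 < max_attempts) {
      std::this_thread::sleep_for(delay);
      delay *= 2;  // exponential backoff: double the wait each time
    }
  }
  return err;  // still failing after all attempts: report it
}
```

A caller could wrap a real sync as `SyncWithBackoff([fd] { return ::fsync(fd) == 0 ? 0 : errno; }, 5, std::chrono::milliseconds(10))`, where the attempt count and delay are arbitrary example values.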

laanwj commented Apr 29, 2025

My understanding of the issue is that this change was made because some file systems (CIFS, at the time) fail immediately when fsync() is called on a directory. As this is an inherent property of the file system, retrying wouldn't change anything. Returning the error to the caller is what we want to avoid as the software should work with these file systems, even if this means being able to make fewer guarantees.
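
A rough sketch of the behavior described above, assuming a POSIX system. It is not the patch itself: the names `IsIgnorableSyncError` and `SyncDirTolerant` are hypothetical, and the real change lives inside LevelDB's own sync routine.

```cpp
#include <cerrno>
#include <fcntl.h>
#include <unistd.h>

#include <string>

// Hypothetical predicate: an EINVAL from fsync() on a directory is read as
// "this file system cannot sync directories" rather than as a failed write.
bool IsIgnorableSyncError(int err, bool syncing_dir) {
  return syncing_dir && err == EINVAL;
}

// Hypothetical helper: fsync a directory, tolerating the EINVAL case.
// Returns 0 on success (or tolerated failure), otherwise an errno value.
int SyncDirTolerant(const std::string& dir_path) {
  int fd = ::open(dir_path.c_str(), O_RDONLY);
  if (fd < 0) return errno;
  int err = (::fsync(fd) == 0) ? 0 : errno;  // capture errno before close()
  ::close(fd);
  if (IsIgnorableSyncError(err, /*syncing_dir=*/true)) return 0;
  return err;
}
```

The check keys on the error code rather than on the file system type, so it applies anywhere fsync() reports EINVAL for a directory.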

That said, this was added as an application-specific workaround for a 2018 (no, even older) bug in CIFS; it may no longer be needed, and it may not be necessary to upstream.

felipecrv (Contributor) commented

> Returning the error to the caller is what we want to avoid as the software should work with these file systems, even if this means being able to make fewer guarantees.

Understandable. The problem is that this solution makes fewer guarantees on every file system, not just CIFS. fsync for CIFS becoming a no-op in the kernel really is the best solution.

l0rinc marked this pull request as draft on April 30, 2025, 14:32